ABSTRACT

A study investigated the usefulness of non-native
speakers' subjective, relative word frequency estimates as a measure
of second language proficiency. In the experiment, two subjective
frequency estimate (SFE) tasks, one on French and one on English,
were presented to French learners of English (n=126) and American
learners of French (n=87). Subjects were university students studying
in France. Each group received lists of 30 words (nouns and
adjectives only), drawn from published frequency lists and presented
alphabetically. Instructions to rank-order the words for frequency
were given to each group in its native language. Results suggest the
English list was easier to rank-order. In addition, while the
performance of native speakers was better than that of non-native
speakers on the English list, non-natives performed slightly but not
significantly better on the French list. The inconclusive results
suggest that SFEs cannot provide indirect second-language (L2)
proficiency measures. The close relationship of the two languages is
seen as a possible confounding variable. The better performance of
the American students is ascribed to the selectivity of the students'
home institution. (MSE)

1. INTRODUCTION



The words of a language occur in discourse with different
frequencies, which can be counted in a corpus in order to
establish frequency lists. That frequency of occurrence is
somehow attached to the mental representations of words
appears clearly in psycholinguistic research, for instance in
experiments on memorization (see e.g. Gregg, Montgomery &
Castaño 1980) or word association tasks (see e.g. Howes 1957).
In addition, it certainly plays a part in comprehension, which
appears in the link between average word frequency and
readability (see e.g. Klare 1968).

In a variety of experiments, Ss have been asked to perform
tasks resting explicitly on word frequencies. These
experiments have shown that native speakers are able to
provide word frequency estimates that correlate well with
objective data. In this area, a line of research that is of
interest to specialists of foreign-language learning consists
in exploring the ability of non-natives to provide subjective
frequency estimates (henceforth SFEs), which might provide us
with insights into the mental lexicon of language learners
and, in case reliable differences with natives are found,
allow us to set up indirect proficiency tests.

After reviewing the literature on L1 subjective frequency
estimates, and then the far less numerous experiments on L2
estimates, I shall present an experiment in which two
identical SFE tasks, one on French and the other on English
words, were presented to French learners of English and
American learners of French.



2. PREVIOUS RESEARCH



Before reviewing existing research, it is useful to
consider the different experimental paradigms available in SFE
investigations. The two SFE methods can be termed absolute and
relative. In the case of the absolute method, Ss are requested
to provide frequency assessments for separate items, such as
"frequently used, hardly ever used" or "used once a month,
once a week", etc. In the other case, that of the relative
method, the Ss have to work on a list of words: they may have
to provide a frequency figure for each item (in which case an
anchoring value may be supplied for the first item), or else
they may have to reclassify for frequency a list of words
presented in random order.

Tryk (1968) assembled a list of 100 English words by
logarithmic sampling of the Thorndike and Lorge (1972) list.
Fifty students were required to provide estimates of the "once
a week" (i.e. absolute) type relative to what they thought was
a) the average American's usage and b) their own usage. The
task was repeated after a five-week interval. Test-retest
reliability was very high (.96 and .98); the correlations
between the four sets of SFEs and the Thorndike and Lorge
frequencies ranged between .74 and .78. In view of the high
reliabilities of the SFEs, Tryk concluded that they provide
information different from that available in frequency lists.

An experiment by Shapiro (1969) was much more complex and
cannot be presented in much detail here. Shapiro used the two
relative methods (no anchoring value in the first case).
Different groups of Ss had to work on lists of various
lengths. Among Shapiro's many findings, the following are of
particular interest here: a) the two variants of the relative
method provided very similar results; b) SFEs provided by
subjects in different age groups were comparable; c)
correlations between the subjective orderings and objective
ones (Kučera & Francis 1967 and Thorndike & Lorge 1972) ranged
between .92 and .975.

Carroll (1971) used roughly the same methods as Shapiro.
His subjects were a group of 15 professional lexicographers
and one of 13 non-specialists. The correlations between the
values obtained for each word and those in the Carroll, Davies
and Richman (1971) list were computed. There appeared a highly
significant difference in precision between the SFEs by the
lexicographers and those by the other group. In addition, the
correlations between SFEs and published data were .92 for the
non-specialists and .97 for the lexicographers, also a highly
significant difference. Carroll, noting the discrepancies
between the SFEs and the data from the objective word counts,
claimed that the two methods do not measure the same thing.
According to him, subjective data are more valid than
objective ones, because the latter are subject to various
sampling biases which do not affect the human mind. Carroll
also concluded that subjective frequencies have more
psychological relevance.

In a large-scale experiment, Richards (1974) had 1000
Canadian students provide absolute estimates on a total of
4,495 "concrete" words from a dictionary presented in lists of
50 items. The words were then rank-ordered for "familiarity",
and, for a sub-set of 2,496 nouns, the rank-order correlation
with the data of the Kučera and Francis (1967) list was .575,
which is highly significant, but lower than the results
published by other authors.

Ringeling (1984), in an experiment which will be mentioned
again further on since it involved a group of non-natives,
hypothesized that the discrepancies between SFEs and published
frequency counts might be due to the fact that the Ss did not
have clear enough instructions. He asked his subjects (among
whom 5 natives) to rank-order 24 English words by frequency a)
in the language and b) in their personal linguistic
environment. Ringeling, like his predecessors, observed high
correlations between the SFEs and the objective rankings
(Carroll et al. 1971); in addition, the correlations were
slightly lower in the "personal" condition than in the
"language" condition, which, to quote the author, "tentatively
confirms the idea that the Ss did not treat the two tasks as
one."

In an experiment (Arnaud 1989) conducted on French
university students who were requested to rank-order one or
two lists (A and B) of 30 French words, the following results
were obtained: a) test-retest reliability on list A with a
five-week interval was .60 (N = 51); b) the median rank-order
correlation between SFEs and objective data was .04 (N = 322)
for list A and .79 (N = 119) on list B; c) the correlation
between students' individual ranking scores on lists A and B
was .55 (N = 119), which provides a measure of concurrent
validity; d) finally, it was found that the students who had
provided the orderings of list A closest to that in the
published frequency list were those very students who had
obtained the highest test-retest correlation (the correlation
between the two measures was highly significant at .53). This
last result is interesting insofar as it shows that there
exist large individual differences in SFE performance, and
that the subjects who provide the "best" rank-orderings also
provide the stablest ones. Carroll's statement should be
reviewed in this light, as it seems that not all subjects can
be reliable informants when word frequencies are concerned.
Another finding in my experiment was that there was no
significant correlation between SFE scores and scores on a
word-knowledge test, which seems to indicate that awareness of
frequencies and vocabulary size are two distinct dimensions of
lexical competence.

SFEs by non-natives have been the object of far less
research, although there was a period in the history of
language testing, between the structuralist-psychometric
period and the beginnings of communicative testing, when it
was widely thought that indirect tasks of a psycholinguistic
nature could provide reliable and valid measures of
foreign-language proficiency.

In an experiment summarized by Upshur (1979), Thrasher
(1973) compared relative SFEs of learners and natives on a
list of 60 English verbs of relatively high frequency.
Inter-rater reliabilities were high for the natives (.88 for a
group of five adults) and somewhat lower for the non-natives,
for whom they ranged between .40 and .645. The native SFEs
correlated with the Carroll et al. (1971) data at the .40 level
for children and .695 for adults, and those of learners ranged
between .64 and .75. Thrasher also found that SFEs by the
more advanced learners were closer to those by the native
controls.

Upshur's (1975) methodology, as presented in his Ph.D.
dissertation, was highly complex and I can only present the
bare essentials here. The author's aim was to determine
whether the SFEs of learners improve with proficiency and can
thus be used as indirect proficiency measures. The subjects
were Spanish learners of English, and the frequency data were
obtained from Eaton's (1967) quadrilingual list. One problem was
the closeness of a pair of languages like Spanish and English,
and Upshur concentrated on words whose equivalent in the other
language had a significantly different frequency. Words were
presented in groups of three, among which the Ss had to
indicate which one was the most frequent. Four tasks were
assembled: Nouns, Verbs x English, Spanish. Scoring was
extremely complex and was relative to native speaker
performance. The 58 Ss also took a battery of four tests,
three multiple-choice, discrete-item ones and one more
communicative in nature. Of all the correlations between SFE
scores and test results, only one reached the .05 significance
level; in addition, biographical variables including a stay in
an English-speaking environment were not found to be
significantly correlated with SFE performance.

In his already mentioned study, Ringeling (1984) compared
the performances of 5 native speakers of English and 5 very
advanced Dutch learners. There was no difference in
performance between the natives and the non-natives in the
"language" condition, but the Dutch answers diverged from the
native ones in the "personal" condition.



3. EXPERIMENT

Given the scarcity of results on non-native SFEs and their
rather inconclusive nature, I decided to extend my experiment
on native SFEs to two languages with natives and learners in
both cases, with the following research question: do natives
provide reliably more accurate SFEs than non-natives?

As the Ss would be university students available during
normal teaching time, the tasks had to be simple and feasible
in a short time interval. Rank-ordering for frequency of a
list of 30 words presented in alphabetical order appeared
through pretesting to correspond to these specifications; if
one remembers the taxonomy of tasks presented earlier, this is
a relative method. SFE performance was to be assessed by way
of the rank-order correlation (Spearman's rho) between the
ordering provided by each S and that available in a published
frequency count. The number of words included in the tasks,
30, was chosen because, as I have just indicated, the task was
found to take a reasonable amount of time, but also because 30
is the number of rank pairs at which Spearman's rho begins to
have a normal distribution (see Guilford & Fruchter 1978:295).
This made it possible to consider the Spearman's rho between
each S's rank-ordering and the criterion ordering as a normal
SFE score.
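
As a minimal sketch of this scoring step, assuming that no ranks are tied (each S uses every rank from 1 to 30 exactly once), Spearman's rho can be computed from the squared rank differences. The function name and the five-word toy lists are illustrative only and are not taken from the experimental materials.

```python
def spearman_rho(subject_order, criterion_order):
    """Spearman's rho between two orderings of the same words (no ties)."""
    if sorted(subject_order) != sorted(criterion_order):
        raise ValueError("both orderings must contain the same words")
    n = len(criterion_order)
    # Rank of each word in the criterion ordering (1 = most frequent).
    criterion_rank = {word: i + 1 for i, word in enumerate(criterion_order)}
    # Sum of squared differences between subject rank and criterion rank.
    d2 = sum((i + 1 - criterion_rank[word]) ** 2
             for i, word in enumerate(subject_order))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical five-word example (the real task used 30 words).
criterion = ["day", "child", "book", "king", "drawer"]
subject = ["child", "day", "book", "drawer", "king"]
print(round(spearman_rho(subject, criterion), 2))  # prints 0.8
```

An ordering identical to the criterion scores 1, and a fully reversed ordering scores -1.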

The French list (see Appendix) was assembled by logarithmic
sampling of the Juilland, Brodin and Davidovitch (1970) list.
It included only nouns and adjectives: high-frequency
grammatical words were not included, as pretesting had shown
that their presence resulted in too easy a task with the risk
of a ceiling effect; verbs were not included either, as they
posed insurmountable lemmatization problems. As the criterion
for the Ss' SFEs was to be the rank-order in the published
frequency count, this was checked against the ordering
provided by another published list, that of the Trésor de la
langue française (Etudes 1971). The correlation between the
rank orders on the two lists was .94; this figure will have to
be kept in mind as a reliability limit for the SFE scores.
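
The sampling procedure itself is not spelled out here. One possible reading of "logarithmic sampling", sketched purely as an assumption, is to choose sampling points whose ranks in the frequency list are roughly evenly spaced on a logarithmic scale, and then to pick an eligible noun or adjective near each point. The function name and the list size of 5,000 entries are hypothetical.

```python
def log_spaced_ranks(n_items, max_rank):
    """Return n_items strictly increasing ranks, roughly even in log(rank)."""
    if not 2 <= n_items <= max_rank:
        raise ValueError("need 2 <= n_items <= max_rank")
    ranks = []
    for k in range(n_items):
        # Geometric progression from rank 1 up to max_rank.
        target = round(max_rank ** (k / (n_items - 1)))
        # Rounding repeats some low ranks; keep each rank above the last one.
        ranks.append(max(target, ranks[-1] + 1) if ranks else target)
    return ranks


# Hypothetical call: 30 sampling points in a 5,000-entry frequency list.
ranks = log_spaced_ranks(30, 5000)
print(ranks[:6], ranks[-1])
```

Under this reading the sampled words are dense among the very frequent items and sparse among the rare ones, which spreads the 30 words across the whole frequency range.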

The English list (see Appendix) was gathered following a
similar procedure. The source was the Carroll et al.
(1971) list. Two other published frequency lists (Kučera &
Francis 1967; Hofland & Johansson 1982) were available for
verifying the reliability of the ordering provided by the
Carroll et al. list, and three comparisons were thus possible:
the correlations were .93, .96 and .97 (Table 1). In order to
improve the reliability of the criterion, the frequencies in
the three lists were added and the words re-ordered, thus
reducing the discrepancies between the lists.
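
The construction of the combined criterion can be sketched as follows, assuming that raw frequencies are simply added across lists, as the text states. The function name and all counts below are invented for illustration; they are not the published figures.

```python
def combined_order(*frequency_lists):
    """Order words by the sum of their frequencies across several lists."""
    words = set(frequency_lists[0])
    if any(set(freqs) != words for freqs in frequency_lists):
        raise ValueError("every list must cover the same words")
    totals = {w: sum(freqs[w] for freqs in frequency_lists) for w in words}
    # Most frequent first; alphabetical order breaks exact ties.
    return sorted(totals, key=lambda w: (-totals[w], w))


# Hypothetical counts for three words in three frequency lists.
list_a = {"word": 900, "table": 300, "sugar": 310}
list_b = {"word": 200, "table": 60, "sugar": 40}
list_c = {"word": 210, "table": 70, "sugar": 45}
print(combined_order(list_a, list_b, list_c))  # ['word', 'table', 'sugar']
```

In this toy case the first list alone would place "sugar" above "table", while the summed counts reverse the pair, which is the kind of discrepancy the combined criterion is meant to smooth out.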

For each of the two experimental lists, the French one and
the English one, a French and an English version were prepared
in which the instructions were printed in that language. The
alphabetical lists were printed on the left-hand side of A4
sheets, and the Ss were requested to write their re-ordering
in a column with 30 numbered lines on the right side of the
sheet. A central space was left empty for the Ss' use.

The French Ss were 126 first-year university students
engaged in various fields of the humanities. All had studied
English for seven years in secondary schools. The
English-speaking Ss were 87 American sophomores from Dartmouth College
who were following a term of studies abroad at the Université
Lumière; as only small groups stay in Lyon at a time, it was
necessary to test three different groups over a year to reach
a sufficient number of subjects.

The Ss were given sheets with instructions in their native
language. They worked first on the list in their native
language, and took the L2 task the following week. The tasks
were completed in 20 min for the slowest Ss, other subjects
requiring considerably less time; it would be an interesting
direction for further research to determine whether there is a
link between speed and the quality of the SFEs.

Results are reproduced in Table 2. It appears that the
English list was easier to rank-order than the French one. As
the distributions of SFE scores were not normal, the median is
indicated; in addition, the significance of differences
between groups was calculated using the median test (see
Guilford & Fruchter 1978:216-17). On the English list, the
performance of the native speakers was superior to that of the
non-natives (chi² = 16.435, p<.001); on the French list,
however, it was the non-natives who performed slightly better,
although not significantly so (chi² = .08).
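
For readers unfamiliar with the median test, a minimal sketch follows. It assumes the simplest variant: scores from both groups are pooled, each score is classed as above or not above the pooled median, and the resulting 2 x 2 table is tested with chi-square on 1 degree of freedom, without continuity correction. Whether the procedure in Guilford & Fruchter applies a correction is not stated here, so this may differ in detail from the computation actually used. The function name and the toy scores are hypothetical.

```python
import math
from statistics import median


def median_test(group_a, group_b):
    """Two-group median test: returns (chi-square, p-value) with 1 df."""
    grand_median = median(list(group_a) + list(group_b))
    # Counts of scores above / not above the pooled median in each group.
    a_above = sum(x > grand_median for x in group_a)
    b_above = sum(x > grand_median for x in group_b)
    a_below = len(group_a) - a_above
    b_below = len(group_b) - b_above
    n = len(group_a) + len(group_b)
    denom = ((a_above + b_above) * (a_below + b_below)
             * len(group_a) * len(group_b))
    if denom == 0:
        raise ValueError("a row or column of the 2 x 2 table is empty")
    chi2 = n * (a_above * b_below - b_above * a_below) ** 2 / denom
    # Upper-tail probability of chi-square with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical SFE scores for two small groups.
natives = [0.70, 0.75, 0.80, 0.85]
learners = [0.50, 0.55, 0.60, 0.65]
chi2, p = median_test(natives, learners)
print(round(chi2, 3), round(p, 4))
```

With 1 degree of freedom, the same tail formula applied to the reported value of 16.435 gives a probability well below .001, in line with the significance level quoted above.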



4. DISCUSSION

The results would have been conclusive if and only if the
natives had performed significantly better than the
non-natives in both cases, which did not happen. Considering the
rather uncertain results of other experiments, it seems
reasonable to conclude that SFEs cannot provide indirect L2
proficiency measures.

We are left, however, with the task of explaining these
results. The first explanation that comes to mind, and one
that, to be fair, had not escaped Upshur, is the proximity of
the conceptual systems of two languages like French and
English. In addition, the status of SFE tasks is not entirely
clear: what is it that goes on in the Ss' minds while they
work on an SFE task? Are they ordering the words purely in
terms of the frequency of their occurrence in the language
they have been exposed to, or are they also taking into
account the frequencies of the designata of these words in
their environment? It is not unreasonable to think that the
two strategies are inextricably mixed. One of the results of
Carroll's (1971) study may reflect this: lexicographers were
found to provide more precise SFEs, which may result from a
better ability to sort out the linguistic from the real-world
aspects. Whatever the case may be, the frequencies of
occurrence of items in the French and English lists and their
equivalents in the other language were compared. This was done
deliberately in the most subjective way: for each word of the
two lists, the first equivalent that came to mind was
retained. In cases when there was no clear, univocal
equivalent, the item was not taken into account. This left me
with 23 items from the French list and 2Q items from the
English list. For the English equivalents to the French list,
the frequency data from the three already mentioned lists were
combined, and for the French equivalents to the English list,
the Trésor de la langue française data were used. The rank
orderings of original words and their equivalents were then
correlated. For the 23 surviving items of the French list, the
correlation (Spearman's rho) was .84; the corresponding figure
was .8d for the English list. These results are comparable to
an observation by Kirsner, Smith, Lockhart and King (1984)
who, in the preparatory phase of an experiment on
bilingualism, had found a rank-order correlation of .64
between the frequencies of 116 English words and their French
equivalents. It appears clearly that a subject whose strategy
had consisted in relying on L1 equivalents when performing the
L2 task would still have been able to get a good SFE score.
Incidentally, my earlier statement about the unsuitability of
SFE tasks as proficiency measures needs perhaps to be
qualified until further research has been done on less closely
connected pairs of languages, since such a strategy might
prove less effective in such a case.

The reader may remember that my previous experiment had
shown the existence of considerable individual differences in
L1 SFE ability, which appeared in closeness to objective data
and stability over time. If the Ss somehow or other relied on
L1 equivalents to perform the L2 tasks, there should be a
relationship between L1 and L2 SFE scores of individual Ss.
This is indeed the case, and the product-moment correlation
was .33 for the French subjects and .30 for the American ones,
both highly significant.

There remains to be explained the fact that the American
subjects performed better overall than their French
counterparts. A simple answer may be provided: Dartmouth
College is a highly selective institution, whereas the
humanities departments of French universities are open to
anyone with a baccalauréat, selection being prohibited by law;
in addition, the humanities do not in general attract the most
motivated students from the secondary schools. There is little
doubt that the difference in overall ability is sufficient to
explain the differences in performance.



5. CONCLUSION

A practical conclusion that can be drawn from this
experiment is that SFE tasks do not constitute an interesting
direction for the development of indirect L2 proficiency
tests, at least when the L1 and L2 are closely related.

Of a more fundamental interest is the fact that learners 
apparently resort to strategies that involve the vocabulary of
their native language when faced with a metalinguistic task on 

Table 1

Correlations between rank orders for the 30 English words

Carroll & al. 1971 / Kučera & Francis 1967       : .93
Carroll & al. 1971 / Hofland & Johansson 1982    : .96
Kučera & Francis 1967 / Hofland & Johansson 1982 : .97



Table 2
SFE scores (rho's)

===================================================================
Ss                              French word list   English word list
===================================================================
francophones    lowest score          .08                .25
(N = 126)       highest score         .80                .91
                mean                  .61                .70
                median                .63                .7'i

anglophones     lowest score          .40                .51
(N = 87)        highest score         .83                .90
                mean                  .63                .76
                median                .63                .77
===================================================================

French word list     English word list

jour                 word
                     man
beau                 first
                     thing
enfant               good
travail              picture
tête                 important
peine                young
livre                table
prix                 machine
salle                winter
bC3Ut                real
roi                  sugar
facile               busy
dur                  clock
chéri                peace
conseil              basis
étrange              terrible
double               traffic
arrive               pump
malheur              code
pâle                 bomb
fruit                massive
marchandise          cupboard
restaurant           pepper
culte                courageous
remède               execution
tiroir               razor
camp                 bulldog
idylle               lyre

(decreasing frequencies)